Exploring feature set combinations for WSD

Authors

  • Eneko Agirre
  • Oier Lopez de Lacalle
  • David Martínez
Abstract

This paper explores splitting feature sets in order to obtain better WSD systems through combinations of classifiers learned over each of the split feature sets. Our results show that only k-NN is able to profit from the combination of split features, and that simple voting is not enough to achieve this. Instead, we propose combining all k-NN subsystems, where each of the k neighbors casts one vote. We have performed a thorough evaluation on two datasets (Senseval-3 Lexical-Sample and All-Words), having set the best combination options on a development dataset (Senseval-2 Lexical-Sample). The results for the All-Words task are the best published to date. The results for the Lexical-Sample task are state-of-the-art.
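The combination scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the sparse-dictionary feature representation, the subset names ("local", "topical"), the toy data, and the function names `cosine`, `knn_votes`, and `combine_knn` are all assumptions made for illustration. The only idea taken from the abstract is that one k-nearest-neighbor subsystem is built per split feature set and that each of the k neighbors of each subsystem casts one vote.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse feature dicts (assumed metric)."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_votes(train, query, k):
    """Return one vote per neighbor: the senses of the k most similar examples."""
    ranked = sorted(train, key=lambda ex: cosine(ex[0], query), reverse=True)
    return Counter(sense for _, sense in ranked[:k])


def combine_knn(train_by_subset, query_by_subset, k):
    """Sum the neighbor votes of every per-subset k-NN subsystem."""
    total = Counter()
    for name, train in train_by_subset.items():
        total += knn_votes(train, query_by_subset[name], k)
    return total.most_common(1)[0][0], total


# Hypothetical toy data for the ambiguous word "bank", split into two feature sets.
train_by_subset = {
    "local": [({"river": 1.0}, "shore"),
              ({"money": 1.0}, "finance"),
              ({"loan": 1.0}, "finance")],
    "topical": [({"water": 1.0}, "shore"),
                ({"cash": 1.0}, "finance"),
                ({"fish": 1.0}, "shore")],
}
query_by_subset = {
    "local": {"money": 1.0, "river": 1.0},
    "topical": {"water": 1.0, "fish": 1.0},
}

sense, votes = combine_knn(train_by_subset, query_by_subset, k=2)
print(sense, dict(votes))
```

In this sketch the neighbor votes are pooled across subsystems before a sense is chosen. Under simple voting, by contrast, each subsystem would first pick its own winning sense and contribute only that single decision.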


Similar resources

Exploring feature sets for Turkish word sense disambiguation

This paper presents an exploration and evaluation of a diverse set of features that influence word-sense disambiguation (WSD) performance. WSD has the potential to improve many natural language processing (NLP) tasks, as it is one of the most crucial steps in the area. It is known that exploiting effective features and removing redundant ones helps improve the results. There are two groups of f...


Unsupervised Domain Adaptation for Word Sense Disambiguation using Stacked Denoising Autoencoder

In this paper, we propose an unsupervised domain adaptation method for Word Sense Disambiguation (WSD) using a Stacked Denoising Autoencoder (SdA). SdA is an unsupervised learning method for obtaining an abstract feature set of the input data using a neural network. The abstract feature set absorbs the differences between domains, and thus SdA can address the problem of domain adaptation. However, SdA does not always c...


Exploring the Effect of Bag-of-words and Bag-of-bigram Features on Turkish Word Sense Disambiguation

Feature selection in Word Sense Disambiguation (WSD) is as important as the selection of the algorithm used to remove sense ambiguity. Bag-of-words (BoW) features capture information about the neighbors of the ambiguous target word without considering any relation between words. In this study, we investigate the effect of BoW features and Bag-of-bigrams (BoB) on Turkish WSD and compare the results with...


Research Paper: A Multi-aspect Comparison Study of Supervised Word Sense Disambiguation

OBJECTIVE: The aim of this study was to investigate relations among different aspects in supervised word sense disambiguation (WSD; supervised machine learning for disambiguating the sense of a term in a context) and compare supervised WSD in the biomedical domain with that in the general English domain. METHODS: The study involves three data sets (a biomedical abbreviation data set, a general ...


CITYU-HIF: WSD with Human-Informed Feature Preference

This paper describes our word sense disambiguation (WSD) system participating in the SemEval-2007 tasks. The core system is a fully supervised system based on a Naïve Bayes classifier using multiple knowledge sources. Toward a larger goal of incorporating the intrinsic nature of individual target words in disambiguation, thus introducing a cognitive element in automatic WSD, we tried to fine-tu...



Journal:
  • Procesamiento del Lenguaje Natural

Volume 37

Publication date: 2006